\begin{abstract}
  In Web search, a user refines his search several times before he
  finds the information he needs. It is very likely that, in the
  search log, similar sequences of searches appear many times, as many
  users had searched the Web with the same intent.  Precisely
  interpreting the intent of the user is difficult, even with the help
  of the search log: there might be numerous instances of such intent
  scattering in small pieces in the log, but none of them is
  comprehensive enough to describe the concept precisely.  This
  scenario occurs in many applications. For example, patterns in Web
  search, Internet traffic, program execution traces, network events,
  etc., are often non-stationary, yet the same patterns recur over
  time.  In this paper, we argue that observable patterns are
  generated by hidden models, and precisely characterizing such models
  is only possible if we cluster as much data generated by such models
  as possible and learn from the clustered data as a whole, instead of
  learning from a single episode of such model.  The benefits is
  obvious as it enables us not only to better understand the
  underlying system that generates the data, but also to recognize
  future instance of a model as soon as it occurs. To achieve this, we
  introduce a clustering based approach, where we adopt a novel
  clustering criterion, validation error minimization, to ensure that
  the found models are unique and precise.  We propose a two step
  algorithm, which uses enhanced dynamic programming and EM like
  methods for clustering.  Experiments show that in benchmark
  datasets, our approach achieves the highest accuracy with lowest
  cost in comparison with the current best approaches.
\end{abstract}

%% inten intentIn this paper, we are interested in
%% finding accurate description of such ``intentions'' or concepts in
%% historical data.  
%% State-of-the-art approaches are insufficient for this purpose as they
%% keep learning or updating models from the evolving data, and use these
%% impromptu models for online prediction.  In many cases, this proves to
%% be both costly and ineffective --- much time is wasted on re-learning
%% recurring concepts, yet the classifier may remain one step behind the
%% current trend all the time. In this paper, we propose a approach which
%% finds unique concepts in the evolving data by clustering. 
%%  At run time, combining historical concept change patterns
%% and cues provided by an online training stream, we find the most
%% likely current concept and use its corresponding models to classify
%% data in an unlabeled stream.  The primary advantage of the high-order
%% model approach is its high accuracy. Another
%% important benefit is that, unlike state-of-the-art approaches, 
